COMPARING REGRESSIONS WHEN MEASUREMENT ERROR VARIANCES ARE KNOWN

Abstract 

In a multiple (or multivariate) regression model where the predictors 
are subject to errors of measurement with a known variance-covariance 
structure, two-sample hypotheses are formulated for (i) equality of regres- 
sions on true scores and (ii) equality of residual variances (or covari- 
ance matrices) after regression on true scores. The hypotheses are tested 
using a large-sample procedure based on maximum likelihood estimators. 
Formulas for the test statistic are presented; these may be avoided in 
practice by using a general purpose computer program. The procedure has
been applied to a comparison of learning in high schools using achievement 
test data. 




1. Introduction 



Often we want to compare two groups of subjects on the basis of a
posttest, with adjustment made for scores on a pretest. For this purpose
many experimenters have used the analysis of covariance. When the observed
pretest scores contain errors of measurement, however, we would really like
to make our adjustment in terms of true scores [Lord & Novick, 1968], since
otherwise "the covariance adjustment does not properly correct for the bias
in the difference of adjusted means as an estimate of the difference in
intercepts" [Cochran, 1968, p. 653]. Because true scores are unobservable,
one needs extra information to make a satisfactory correction. Lord
[1960] has proposed a method of doing this when duplicate measurements
(with independent errors) on an individual's score are available.

Another problem frequently encountered is that the within-group
regressions may not be parallel. In the case of one pretest variable,
the regression lines may cross so that the subjects in one group may
score higher or lower on the posttest than those in the other group,
depending on the pretest score. Cronbach and Gleser [1965, pp. 177-181]
give some examples of this, and Johnson and Jackson [1959, pp. 424ff]
describe a method (the Johnson-Neyman technique) for ascertaining
statistical significance in this case.

In this article we propose a method of testing for equality of
regressions in two groups when all variables in the regression equation are
true scores. The end is similar to that achieved by Lord [1960], but we allow
the regression slopes to be different. Instead of assuming that duplicate
measurements exist, we assume that the pretest measurement error
variance-covariance matrix is known and the same for both groups, and that
the posttest measurement error covariance matrix is the same for both groups.
Either the pretest score or the posttest score, or both, may be
multivariate. The case where both are univariate is described in Stroud [1972].

The study which motivated the methodology described here involved a
desire to compare the learning taking place in two groups of schools, where
the Iowa Tests of Educational Development (ITED) were administered in grade
9 (pretests) and the Tests of Academic Progress (TAP) were administered
in grade 11 (posttests). It was felt this could be best achieved by testing
the null hypothesis that the true score regressions were the same for both
groups. The formulation described here should be applicable in many
situations involving tests or measurements in which multivariate normality may
be assumed and measurement error variances and covariances of the predictor
variables are known, as is the case with tests such as the ITED if one is
willing to use the publisher's figures for standard errors of measurement
and assume the subtest measurement errors uncorrelated.

For subjects in the first group, let us write, using the classical
test theory model, the vector equation $X = T + E$ (observed score = true
score + error of measurement), which we partition as

$$\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} T_1 \\ T_2 \end{bmatrix} + \begin{bmatrix} E_1 \\ E_2 \end{bmatrix}.$$

Here $X_1$ is the vector of pretest scores for an individual, $X_2$ the
corresponding vector of posttest scores (not necessarily of the same dimension),
and $T_1$, $T_2$, $E_1$, and $E_2$ are the corresponding true score and error
vectors. We assume zero correlation between $T$ and $E$ and between $E_1$
and $E_2$. For subjects in the second group, let the corresponding equation
be $Y = U + F$, again partitioned into pretest and posttest vectors.

We are interested in testing the hypothesis that the regression
function of posttest true score on pretest true score is the same in both
groups, i.e.,

$$H_1:\quad \mathcal{E}[T_2 \mid T_1 = t] = \mathcal{E}[U_2 \mid U_1 = t] \quad \text{for all vectors } t.$$

In the school comparison example cited above, one could interpret this
as meaning that the average "learning" (in some sense) is the same in both
groups of schools. Another hypothesis of interest is that the learning
is equally uniform in both groups of schools, i.e., that the true score
residual covariance matrix is the same in both groups:

$$H_2:\quad \operatorname{Cov}[T_2 \mid T_1 = t] = \operatorname{Cov}[U_2 \mid U_1 = t] \quad \text{for all vectors } t,$$

where the symbol $\operatorname{Cov}$ denotes covariance matrix. We may be
interested in either $H_1$ or $H_2$ independently of the other; so they are
treated separately.

In Section 2 we express hypotheses $H_1$ and $H_2$ in terms of the
parameters of the observed scores $X$ and $Y$, assuming multivariate normality
and that the measurement errors in the two groups, $E$ and $F$, have the
same covariance matrix with the pretest part known. In Section 3 we
describe how to test these hypotheses using Wald's large-sample test
procedure based on maximum likelihood estimators and a computing algorithm
[Lord, 1972] which calculates the value of the Wald test statistic without
requiring a formula for it, using numerical differentiation. Section 4
contains the results of the above mentioned comparison of schools.

Finally, an appendix is presented which contains a formulation of
some properties of the Wald procedure, and formulas for the asymptotic
covariance matrix of the maximum likelihood estimator of the left-hand
side of each hypothesis, which would enable computation of the Wald
statistic without using numerical differentiation.

2. Formulation of Hypotheses in Terms of Parameters of
Distribution of Observed Scores

Assume that one is given $m + n$ mutually independent observation
vectors; the first $m$ of the form $X = T + E$ ($T$ and $E$ independent) and
the remaining $n$ of the form $Y = U + F$, where $X$ and $Y$ are the only
quantities observed. Assume all distributions are multivariate normal,
viz: $T \sim N(\mu, \Sigma^*)$, $U \sim N(\nu, \Psi^*)$, $E \sim N(0, \Delta)$,
and $F \sim N(0, \Delta)$. $X$ and $Y$ are each partitioned into a pretest
part (dimension $p$) and a posttest part (dimension $q$), as described in
Section 1. This induces a partitioning on $\mu$, $\nu$, $\Sigma^*$, $\Psi^*$,
and $\Delta$, e.g.,

$$\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \qquad \Sigma^* = \begin{bmatrix} \Sigma^*_{11} & \Sigma^*_{12} \\ \Sigma^*_{21} & \Sigma^*_{22} \end{bmatrix};$$

$\nu$ and $\Psi^*$ are represented similarly. $E_1$ and $E_2$ have been assumed
uncorrelated; so write $\Delta = \operatorname{diag}(\Delta_1, \Delta_2)$. We
assume that $\Sigma^*$ and $\Psi^*$ are nonsingular, and that $\Delta_1$ is
known. In practice $\Delta_1$ will usually be diagonal; however, we allow both
$\Delta_1$ and $\Delta_2$ to be any positive semidefinite matrices, provided
$\Delta_1$ can be specified.

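In regression problems involving errors of measurement, it is common
to consider the underlying variables, although random, to be linearly or
"structurally" related [Kendall, 1951; Madansky, 1959; Moran, 1971]. We
take a different approach and consider rather the unconditional joint
distribution of $T_1$ and $T_2$ (or of $U_1$ and $U_2$). We avoid writing
anything as a linear function of $T_1$; we retain $T$ as $N(\mu, \Sigma^*)$
throughout and condition on $T_1$ only to obtain the formulas for
$\mathcal{E}[T_2 \mid T_1]$ and $\operatorname{Cov}[T_2 \mid T_1]$. Taking the
formulas for these quantities from Anderson [1958, page 29], $H_1$ and $H_2$
may be written as follows:

$$H_1:\quad \Sigma^*_{21}\Sigma^{*-1}_{11} = \Psi^*_{21}\Psi^{*-1}_{11}, \qquad \mu_2 - \Sigma^*_{21}\Sigma^{*-1}_{11}\mu_1 = \nu_2 - \Psi^*_{21}\Psi^{*-1}_{11}\nu_1, \tag{2.1}$$

$$H_2:\quad \Sigma^*_{22} - \Sigma^*_{21}\Sigma^{*-1}_{11}\Sigma^*_{12} = \Psi^*_{22} - \Psi^*_{21}\Psi^{*-1}_{11}\Psi^*_{12}. \tag{2.2}$$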



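To write these hypotheses in terms of the parameters of the
distributions of the observed vectors $X$ and $Y$, note that $X$ and $Y$ are
normally and independently distributed with mean vectors $\mu$, $\nu$ and
covariance matrices $\Sigma = \Sigma^* + \Delta$ and $\Psi = \Psi^* + \Delta$,
respectively. Partition $\Sigma$, $\Psi$ in the same way as $\Sigma^*$,
$\Psi^*$; then (2.1) and (2.2) become, respectively,

$$H_1:\quad \Sigma_{21}(\Sigma_{11} - \Delta_1)^{-1} = \Psi_{21}(\Psi_{11} - \Delta_1)^{-1}, \qquad \mu_2 - \Sigma_{21}(\Sigma_{11} - \Delta_1)^{-1}\mu_1 = \nu_2 - \Psi_{21}(\Psi_{11} - \Delta_1)^{-1}\nu_1, \tag{2.3}$$

$$H_2:\quad \Sigma_{22} - \Sigma_{21}(\Sigma_{11} - \Delta_1)^{-1}\Sigma_{12} = \Psi_{22} - \Psi_{21}(\Psi_{11} - \Delta_1)^{-1}\Psi_{12}. \tag{2.4}$$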
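
As an illustration only, the left-minus-right sides of (2.3) and (2.4) can be evaluated from the observed-score parameters along the following lines. This is a minimal NumPy sketch, not the author's program; the function names `true_score_regression`, `g_h1`, and `g_h2` are hypothetical, and the known pretest error covariance matrix is passed in as `delta1`.

```python
import numpy as np

def true_score_regression(mean, cov, delta1):
    """Return the corrected slope matrix, the intercept vector, and the
    left side of (2.4) for one group (pretest variables come first)."""
    p = delta1.shape[0]
    m1, m2 = mean[:p], mean[p:]
    S11, S12 = cov[:p, :p], cov[:p, p:]
    S21, S22 = cov[p:, :p], cov[p:, p:]
    # slope = S21 (S11 - delta1)^{-1}, computed with a linear solve
    slope = np.linalg.solve((S11 - delta1).T, S21.T).T
    intercept = m2 - slope @ m1
    lhs_24 = S22 - slope @ S12  # posttest error covariance is not subtracted
    return slope, intercept, lhs_24

def g_h1(mu, nu, sigma, psi, delta1):
    """pq slope differences followed by q intercept differences, cf. (2.3)."""
    bx, ax, _ = true_score_regression(mu, sigma, delta1)
    by, ay, _ = true_score_regression(nu, psi, delta1)
    return np.concatenate([(bx - by).ravel(), ax - ay])

def g_h2(mu, nu, sigma, psi, delta1):
    """Lower-triangular part of the difference of the two sides of (2.4)."""
    _, _, rx = true_score_regression(mu, sigma, delta1)
    _, _, ry = true_score_regression(nu, psi, delta1)
    q = rx.shape[0]
    return (rx - ry)[np.tril_indices(q)]
```

Both functions return zero vectors when the two groups have identical parameters, which is a convenient sanity check before any testing is attempted.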
3. The Hypothesis Testing Procedure

The method proposed in this article for testing $H_1$ and $H_2$ is an
asymptotic test procedure based on unrestricted maximum likelihood
estimators (MLE), first described by Wald [1943]. Consider the general problem
of testing a vector hypothesis $g(\mu,\nu,\Sigma,\Psi) = 0$ where $(\mu,\Sigma)$
are the mean vector and covariance matrix of one multivariate normal
population and $(\nu,\Psi)$ are the mean vector and covariance matrix of a
second such population. Denote sample estimates by $(\hat\mu,\hat\Sigma)$ and
$(\hat\nu,\hat\Psi)$ respectively, where the sample sizes are $m$ and $n$. Let

$$U = g(\hat\mu,\hat\nu,\hat\Sigma,\hat\Psi),$$

written out as a column vector, using each off-diagonal component of the
symmetric matrices $\hat\Sigma$ and $\hat\Psi$ only once. Then the test
statistic is

$$W = U'\hat{H}^{-1}U, \tag{3.1}$$

where $H = H(\mu,\nu,\Sigma,\Psi)$ is a large-sample approximation to the
covariance matrix of $U$, based on the partial derivatives of $g$, and
$\hat H$ is defined as $H(\hat\mu,\hat\nu,\hat\Sigma,\hat\Psi)$.
The estimates $\hat\mu$ and $\hat\nu$ of the population mean vectors are
usually taken as the sample mean vectors $\bar X$ and $\bar Y$, respectively.
$\hat\Sigma$ and $\hat\Psi$ are the unrestricted maximum-likelihood estimates
$\sum_{\alpha=1}^{m}(X^\alpha - \bar X)(X^\alpha - \bar X)'/m$ and
$\sum_{\alpha=1}^{n}(Y^\alpha - \bar Y)(Y^\alpha - \bar Y)'/n$, although one
could instead use the unbiased versions with denominators of $(m - 1)$ and
$(n - 1)$, the difference being unimportant in large samples.
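
For concreteness, the estimates just described might be computed as below. This is a small sketch assuming NumPy and a data array with one row per subject; the function name `ml_estimates` is hypothetical.

```python
import numpy as np

def ml_estimates(data):
    """Sample mean vector and maximum-likelihood covariance matrix
    (divisor equal to the number of rows, not that number minus one)."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    centered = data - mean
    cov_ml = centered.T @ centered / data.shape[0]
    return mean, cov_ml
```

The result agrees with `np.cov(data, rowvar=False, bias=True)`; dropping `bias=True` gives the unbiased version mentioned in the text.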

The statistic $W$ is asymptotically distributed as central (noncentral)
chi-square if the hypothesis is true (false). This result, originally
proved by Wald [1943] under rather strict regularity conditions, has been
shown by Stroud [1971] to hold under fairly general circumstances, which
apply in particular to the problems studied in this article.

To compute the value of $W$ from given data using the computer program
described in Lord [1972], it is necessary to write a FORTRAN function to
compute each component of $U$, given $\hat\mu$, $\hat\nu$, $\hat\Sigma$, and
$\hat\Psi$. For the two problems described here, $U$ is obtained by writing
equations (2.3) and (2.4) with estimates substituted in and with all
quantities transposed to the left of the equal sign. If $H_1$ is being tested,
the first $pq$ components of $U$ are the components of
$\hat\Sigma_{21}(\hat\Sigma_{11} - \Delta_1)^{-1} - \hat\Psi_{21}(\hat\Psi_{11} - \Delta_1)^{-1}$,
and the last $q$ components of $U$ are given by
$\hat\mu_2 - \hat\Sigma_{21}(\hat\Sigma_{11} - \Delta_1)^{-1}\hat\mu_1 - \hat\nu_2 + \hat\Psi_{21}(\hat\Psi_{11} - \Delta_1)^{-1}\hat\nu_1$.
For $H_2$, $U$ is obtained from a triangular portion of
$\hat\Sigma_{22} - \hat\Sigma_{21}(\hat\Sigma_{11} - \Delta_1)^{-1}\hat\Sigma_{12} - \hat\Psi_{22} + \hat\Psi_{21}(\hat\Psi_{11} - \Delta_1)^{-1}\hat\Psi_{12}$,
written as a vector. The reader is referred to Stocking and Lord [1975] for a
further description of the computing procedure.
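
The general idea of obtaining $W$ without an explicit formula for $H$ can be sketched as follows. This is not the Lord [1972] FORTRAN program, only a hedged NumPy illustration: the derivatives of a user-supplied function `g` are approximated by central differences, and $H$ is assembled from the large-sample covariance formula stated as (A3) in the appendix. The names `num_grad_vec`, `num_grad_sym`, and `wald_statistic`, and the step size `h`, are assumptions of this sketch.

```python
import numpy as np

def num_grad_vec(g, x, h=1e-5):
    """Central-difference Jacobian of g with respect to a vector x."""
    x = np.asarray(x, dtype=float)
    cols = []
    for k in range(x.size):
        e = np.zeros_like(x)
        e[k] = h
        cols.append((g(x + e) - g(x - e)) / (2 * h))
    return np.stack(cols, axis=-1)  # shape (r, len(x))

def num_grad_sym(g, S, h=1e-5):
    """Derivative of g with respect to a symmetric matrix S; off-diagonal
    entries carry the factor 1/2 because both mirror entries are perturbed
    by half a step."""
    S = np.asarray(S, dtype=float)
    d = S.shape[0]
    out = np.zeros((g(S).size, d, d))
    for k in range(d):
        for l in range(k, d):
            E = np.zeros((d, d))
            E[k, l] += 0.5
            E[l, k] += 0.5
            deriv = (g(S + h * E) - g(S - h * E)) / (2 * h)
            out[:, k, l] = deriv
            out[:, l, k] = deriv
    return out

def wald_statistic(g, mu, nu, sigma, psi, m, n):
    """W = U' H^{-1} U, with H built from numerical derivatives of g."""
    mu, nu = np.asarray(mu, float), np.asarray(nu, float)
    sigma, psi = np.asarray(sigma, float), np.asarray(psi, float)
    U = g(mu, nu, sigma, psi)
    a_mu = num_grad_vec(lambda x: g(x, nu, sigma, psi), mu)
    a_nu = num_grad_vec(lambda x: g(mu, x, sigma, psi), nu)
    a_sig = num_grad_sym(lambda S: g(mu, nu, S, psi), sigma)
    a_psi = num_grad_sym(lambda S: g(mu, nu, sigma, S), psi)
    H = a_mu @ sigma @ a_mu.T / m + a_nu @ psi @ a_nu.T / n
    # 2 tr(alpha Sigma beta Sigma) for every pair of components of g
    H = H + 2 * np.einsum('iab,bc,jcd,da->ij', a_sig, sigma, a_sig, sigma) / m
    H = H + 2 * np.einsum('iab,bc,jcd,da->ij', a_psi, psi, a_psi, psi) / n
    return float(U @ np.linalg.solve(H, U)), H
```

Here `g` must accept `(mu, nu, sigma, psi)` and return a one-dimensional array; the estimates are passed in place of the parameters.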
In order that $U$ denote what is intended, i.e., an approximately
normal random vector with mean $g(\mu,\nu,\Sigma,\Psi)$, both
$\hat\Sigma_{11} - \Delta_1$ and $\hat\Psi_{11} - \Delta_1$ ought to be
positive definite, yet theoretically this can fail. If the data turn out to be
such that either $\hat\Sigma_{11} - \Delta_1$ or $\hat\Psi_{11} - \Delta_1$ is
near singularity or has negative roots, one may regard this as an indication
that, due to insufficient sample size, the measurement error has swamped
out the information in the data estimating $\Sigma_{11} - \Delta_1$ or
$\Psi_{11} - \Delta_1$.

Evidence is presented in Stroud [1968, pp. 80-81] that with standardized
achievement test data with $p = q = 1$ and sample sizes greater than 40,
$\hat\Sigma - \Delta$ (and hence $\hat\Sigma_{11} - \Delta_1$) is almost
certain to be positive definite. For the multiple regression case, perhaps
larger sample sizes than this would be necessary to ensure that the anomaly
would not occur.
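
A simple way to screen for this anomaly before computing $W$ is to inspect the characteristic roots of the corrected pretest covariance matrix. The sketch below assumes NumPy; the function names and the relative tolerance `tol` used to flag near-singularity are hypothetical choices, not values taken from the source.

```python
import numpy as np

def corrected_pretest_roots(cov, delta1):
    """Characteristic roots of (pretest block of cov) minus delta1."""
    p = delta1.shape[0]
    return np.linalg.eigvalsh(cov[:p, :p] - delta1)

def correction_is_usable(cov, delta1, tol=1e-8):
    """True when every root is positive and not negligibly small."""
    roots = corrected_pretest_roots(cov, delta1)
    return bool(roots.min() > tol * max(1.0, roots.max()))
```

The check would be applied to the estimated covariance matrix of each group in turn.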

The necessary formulas for computing $W$ without using the Lord
algorithm are given in the appendix. In this case one has the additional
work of substituting the estimates $\hat\mu$, $\hat\nu$, $\hat\Sigma$, and
$\hat\Psi$ into rather complicated formulas for the asymptotic covariance
matrices. The computation of $W$ has been carried out on the data described in
the next section both using the Lord algorithm and using the asymptotic
covariance matrix formulas. The results agree precisely to at least the number
of significant digits retained in Table 1.

4. Analysis of Some Achievement Test Battery Scores

In this section we report the results of the study, mentioned in the
introduction, based on grade 9 ITED ($X_1$ and $Y_1$) and grade 11 TAP
($X_2$ and $Y_2$) subtest scores. The comparison was between one specified
school in Portland, Oregon and the group consisting of the other 11 high
schools in that city.



A version of this study in which $X_1$, $X_2$, $Y_1$, and $Y_2$ were
univariate (composite scores) has already been reported [Stroud, 1972]. The
portion of these results concerning the tests of hypotheses $H_1$ and $H_2$
is included here for purposes of comparison with the multivariate tests.

The TAP subtests used were Composition, Reading, and Mathematics, in 
that order, and the ITED subtests were Social Concepts, Correctness of 
Expression, Quantitative Thinking, Reading (Social Studies), and Reading 
(Natural Sciences). 

First the procedures were applied using as $X_1$ and $X_2$ the standard
scores of all the girls in the specified school (School 1) and for $Y_1$ and
$Y_2$ the scores of the girls in the rest of the city (Schools 2-12) taken
together. Secondly, the same procedure is repeated for the boys. In the
third and final application, the girls in the twelve schools ($X_1$, $X_2$) are
compared with the boys ($Y_1$, $Y_2$). Table 1 shows the value of $W$ for
comparing (i) conditional mean vectors (hypothesis $H_1$) and (ii) conditional
covariance matrices (hypothesis $H_2$) for both the multivariate data just
described and the univariate case utilizing composite scores (averages over
the subtests). Beside the value of $W$ is given the corresponding value of
$P$, the value of the approximating chi-square cumulative distribution
function at $W$ ($P$ = 1 - significance level attained). The sample sizes are
shown in Table 2.

The standard errors of measurement for the ITED subtests were taken 
to be 5*52, 5«l6, 5*^6, 5*32, and 3*^6, respectively. These values were 
derived from the administrator's manual [Science Research Associates,
1965], and are expressed in the appropriate scaling of the Portland
standardization. Subtest measurement errors were assumed to be uncorrelated with each
other. 



Insert Table 1 about here 



It is seen from the significance levels in Table 1 that the results
of the univariate and multivariate applications do not completely
correspond with each other. Notice that in the school-versus-school comparisons
(boys and girls separately) the multivariate test for covariance matrices
reveals greater significance than the corresponding univariate test, but
the multivariate test for mean vectors shows less significance. In the
boy-versus-girl comparison, however, the pattern is reversed.



Insert Table 2 about here



In trying to account for these phenomena, we may note from Table 2,
where the six 8-dimensional mean vectors are tabulated, that subtest scores
for School 1 are consistently lower than those of the 2-12 group. However,
if we compare boys and girls, we find that boys do better in some subtests
(notably the quantitative) whereas girls do better in others (e.g.,
composition). Thus the composite scores used in the p = q = 1 analysis
are appropriate for comparing schools, but not for comparing sexes because
the sex-related differences will tend to be reduced in the averaging of
subtest scores. In the school-versus-school comparison of mean vectors,
most of the meaningful variation has been recorded in the composite score
analysis; so that for example a chi-square of 11.92 on 2 degrees of freedom
is registered as being more significant than a chi-square of 29.94 on 18
degrees of freedom, even though the difference of the chi-squares (18.02)
slightly exceeds the difference in the degrees of freedom.
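
The comparison of the two chi-square values can be checked directly. The sketch below uses only the Python standard library and the closed-form chi-square distribution function for an even number of degrees of freedom, which covers both cases quoted here; the function name is hypothetical, and the second chi-square value is taken as 29.94, i.e. 11.92 plus the stated difference of 18.02.

```python
import math

def chi2_cdf_even_df(x, df):
    """Chi-square CDF for even df:
    1 - exp(-x/2) * sum_{j < df/2} (x/2)^j / j!"""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 0.0
    for j in range(df // 2):
        total += term
        term *= half / (j + 1)
    return 1.0 - math.exp(-half) * total

p_composite = chi2_cdf_even_df(11.92, 2)      # univariate (composite) analysis
p_multivariate = chi2_cdf_even_df(29.94, 18)  # multivariate analysis
```

Under these assumptions `p_composite` exceeds `p_multivariate`, which is the sense in which the smaller statistic is the more significant one.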

In the boy-versus-girl comparison, on the other hand, the
multivariate analysis of mean vectors gives a more impressive result than the
univariate for the reason (mentioned above) that the simple average is far
from being the best linear combination demonstrating differences between
boys and girls.

The interpretation of the results regarding conditional variances and
covariance matrices is more difficult. The main factor is probably that
residual variances based on predicting a single variable by a single
variable cannot be expected to resemble too closely a 3 x 3 residual
covariance matrix based on five predictors. One would guess that, with the
school-versus-school comparisons, there are discrepancies in the residual
covariance matrices which are washed out when we look at just the residual
variance of the composite score. Regarding the boy-versus-girl comparison,
an examination of the data has revealed that the difference between residual
variances in the univariate analysis exceeds (in relative terms as well as
absolute) the difference in residual variances of any of the three subtest
scores in the multivariate results. This may very well be related to the
interaction between sex and subtest content, but the pattern appears too
complicated to give a detailed account here.

In conclusion, it would appear that the univariate and multivariate
analyses taken together are more informative than either one would be alone.
Although the techniques used to study the above data have been derived from
the theory of inference, the author has the distinct impression that an
honest attempt to get as good a feel for the data as possible is more
fruitful than the making of statistical decisions such as accepting
or rejecting hypotheses. This supports the view of Dempster [1969] that
the data-analytic approach to multivariate problems is often more sensible
than a "solution" based on inference.
A1. A Simplifying Transformation of the Parameter Space



The statistic $W$ defined by (3.1) for testing either $H_1$ or $H_2$ may
be computed as described in Section 3. This process involves obtaining
$\hat H$ by numerical differentiation of the function $g(\mu,\nu,\Sigma,\Psi)$
at the points in the parameter space specified by the estimates $\hat\mu$,
$\hat\nu$, $\hat\Sigma$, and $\hat\Psi$. We will now proceed to derive the
analytic form of $H$. This yields an alternative method of computing $W$, in
addition to providing a base for possible study of the properties of $W$.

We assume in the following derivations that the pretest measurement
error covariance matrix $\Delta_1$ is nonsingular. This assumption is not
essential, but it makes possible a transformation which simplifies some of the
terms in the matrix expressions. The derivations of formulas for the more
general case may be carried out with straightforward modifications.

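The transformation is that of rescaling the pretest variables to unit
error of measurement, and is carried out as follows. Assume $\Delta_1$ is
nonsingular; then define $\tilde T$, $\tilde U$, $\tilde E$, and $\tilde F$ as
transformed values of $T$, $U$, $E$, and $F$, respectively, after
premultiplication by the matrix

$$\Gamma = \begin{bmatrix} \Delta_1^{-1/2} & 0 \\ 0 & I \end{bmatrix},$$

e.g., $\tilde T = \Gamma T$. Let $\tilde X = \tilde T + \tilde E$ and
$\tilde Y = \tilde U + \tilde F$; then $\tilde X = \Gamma X$ and
$\tilde Y = \Gamma Y$. Then $\tilde T \sim N(\tilde\mu, \tilde\Sigma^*)$ and
$\tilde U \sim N(\tilde\nu, \tilde\Psi^*)$, where
$\tilde\mu_1 = \Delta_1^{-1/2}\mu_1$, $\tilde\mu_2 = \mu_2$,
$\tilde\Sigma^*_{21} = \Sigma^*_{21}\Delta_1^{-1/2}$,
$\tilde\Sigma^*_{11} = \Delta_1^{-1/2}\Sigma^*_{11}\Delta_1^{-1/2}$,
$\tilde\Sigma^*_{22} = \Sigma^*_{22}$, with similar identities for the $\nu$
and $\Psi^*$ quantities. Clearly the hypotheses (2.1) and (2.2) are unchanged
when $\mu$, $\nu$, $\Sigma^*$, and $\Psi^*$ are replaced by $\tilde\mu$,
$\tilde\nu$, $\tilde\Sigma^*$, and $\tilde\Psi^*$, respectively. Let us
therefore work with the tilde quantities, forgetting the old ones, and the
tilde symbol need be retained no longer since it is unnecessary. However, note
that now $E$ and $F$ each have the covariance matrix

$$\Gamma\Delta\Gamma' = \begin{bmatrix} I & 0 \\ 0 & \Delta_2 \end{bmatrix}.$$

The mean vectors and covariance matrices of $X$ and $Y$ are $(\mu,\Sigma)$ and
$(\nu,\Psi)$, respectively. The hypotheses now read

$$H_1:\quad \Sigma_{21}(\Sigma_{11} - I)^{-1} - \Psi_{21}(\Psi_{11} - I)^{-1} = 0, \qquad \mu_2 - \Sigma_{21}(\Sigma_{11} - I)^{-1}\mu_1 - \nu_2 + \Psi_{21}(\Psi_{11} - I)^{-1}\nu_1 = 0, \tag{A1}$$

$$H_2:\quad \Sigma_{22} - \Sigma_{21}(\Sigma_{11} - I)^{-1}\Sigma_{12} - \Psi_{22} + \Psi_{21}(\Psi_{11} - I)^{-1}\Psi_{12} = 0. \tag{A2}$$

In practice one carries out the transformation simply by dividing each
pretest variable at the outset by its standard error of measurement; then
one tests the hypothesis defined by (A1) and (A2).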
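
For the usual case of a diagonal pretest error covariance matrix, the rescaling can be applied to the sample moments rather than to the raw scores. The following NumPy sketch illustrates this; the function name `rescale_pretest` and the argument `sem` (the vector of pretest standard errors of measurement) are hypothetical.

```python
import numpy as np

def rescale_pretest(mean, cov, sem):
    """Moments of the data after dividing each of the first len(sem)
    variables by its standard error of measurement."""
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    sem = np.asarray(sem, dtype=float)
    scale = np.ones(mean.size)
    scale[:sem.size] = 1.0 / sem
    return mean * scale, cov * np.outer(scale, scale)
```

After rescaling, the known pretest error covariance matrix is the identity, so the intercepts and residual covariance matrices appearing in (A1) and (A2) take the same values as before the transformation, while the slopes are multiplied by the standard errors.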

A2. General Results for Asymptotic Covariance Matrices

We now state some general results which apply to problems of testing
hypotheses concerning samples from two normal distributions. Let the
hypothesis to be tested be $g(\mu,\nu,\Sigma,\Psi) = 0$ and let the test
statistic be $W = U'\hat H^{-1}U$, as defined in (3.1). Consider $m$
observations of $X \sim N(\mu,\Sigma)$ and $n$ observations of
$Y \sim N(\nu,\Psi)$, all independent. We are concerned with asymptotic
results which hold when $m$ and $n$ increase such that $n/m \to \rho$ as
$n \to \infty$, where $0 < \rho < \infty$. Recall that $H$ is an asymptotic
approximation to the covariance matrix of $U$, and hence is of order $n^{-1}$
(under regularity conditions). We define $\Omega = nH$, where $\Omega$ is a
function of $\mu$, $\nu$, $\Sigma$, $\Psi$ which does not depend on $n$. Then
$W = nU'\hat\Omega^{-1}U$, where $\hat\Omega$ is obtained from $\Omega$ by
replacing $\mu$, $\nu$, $\Sigma$, and $\Psi$ by $\hat\mu$, $\hat\nu$,
$\hat\Sigma$, and $\hat\Psi$, respectively. It is shown in Stroud [1971] that
if $g$ is twice differentiable with matrix of first derivatives of full rank
then the distribution of $W$ is asymptotically central (noncentral) chi-square
when $g(\mu,\nu,\Sigma,\Psi)$ is zero (nonzero), and that the components of
$\Omega$ are given by

$$\omega(\alpha,\beta) = \rho\,\alpha_\mu'\Sigma\beta_\mu + \alpha_\nu'\Psi\beta_\nu + 2\rho\,\operatorname{tr}\alpha_\Sigma\Sigma\beta_\Sigma\Sigma + 2\operatorname{tr}\alpha_\Psi\Psi\beta_\Psi\Psi, \tag{A3}$$

where $\alpha$ is defined as $g_i(\mu,\nu,\Sigma,\Psi)$ (the $i$-th component
of $g$ when written as a vector), $\beta$ is defined as
$g_j(\mu,\nu,\Sigma,\Psi)$, and by $\omega(\alpha,\beta)$ is meant the
$(i,j)$-th component of $\Omega$. Subscripts in (A3) denote partial
differentiation; e.g., $\alpha_\mu$ is the $(p + q)$-vector of partial
derivatives of $g_i$ with respect to the components of $\mu$, and
$\alpha_\Sigma$ is the $(p + q) \times (p + q)$ symmetric matrix of partial
derivatives of $g_i$ with respect to the components of $\Sigma$. In the latter
case, since $\Sigma$ is symmetric, the off-diagonal components of
$\alpha_\Sigma$ include a factor of $\tfrac12$ [see Stroud, 1971, formula and
Aitken, 1955]; derivatives with respect to symmetric matrices are defined this
way for simplicity of the resulting formulas.
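
As a worked special case of (A3), offered here only as an illustration and not taken from Stroud [1971], consider the scalar hypothesis $g = \sigma_{ab} - \psi_{ab}$ for a single pair of distinct variables $a \ne b$, where $\sigma_{ab}$ and $\psi_{ab}$ denote individual elements of $\Sigma$ and $\Psi$ and $E_{ab}$ denotes the matrix with a one in position $(a,b)$ and zeros elsewhere.

```latex
\begin{align*}
\alpha_\mu &= \alpha_\nu = 0, \qquad
\alpha_\Sigma = \tfrac12 (E_{ab} + E_{ba}), \qquad
\alpha_\Psi = -\tfrac12 (E_{ab} + E_{ba}), \\
2\operatorname{tr}\alpha_\Sigma \Sigma \alpha_\Sigma \Sigma
  &= \tfrac12 \operatorname{tr}\,(E_{ab} + E_{ba})\Sigma(E_{ab} + E_{ba})\Sigma
   = \sigma_{aa}\sigma_{bb} + \sigma_{ab}^2, \\
\omega &= \rho\,(\sigma_{aa}\sigma_{bb} + \sigma_{ab}^2)
        + (\psi_{aa}\psi_{bb} + \psi_{ab}^2), \\
H = \omega / n &= \frac{\sigma_{aa}\sigma_{bb} + \sigma_{ab}^2}{m}
        + \frac{\psi_{aa}\psi_{bb} + \psi_{ab}^2}{n}.
\end{align*}
```

The result is the familiar large-sample variance of a difference between two independent normal-theory sample covariances, which shows the role played by the factor of one half in the off-diagonal derivatives.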

The final two sections of this appendix are concerned with deriving
formulas for the $\omega(\alpha,\beta)$ when $g$ is given by $H_1$ or $H_2$.
Since $H_2$ is simpler, it is treated first.
A3. Components of H for Testing Equality of Residual Covariance Matrices

Define $\Pi$ as the function of the parameters representing the left-hand
side of (A2); the hypothesis is $\Pi = 0$. A typical component of the matrix
$\Omega = nH$ is given by (A3), which may be written as

$$\omega_{ij,kl} = 2\rho\,\operatorname{tr}\Pi^{ij}_\Sigma\Sigma\Pi^{kl}_\Sigma\Sigma + 2\operatorname{tr}\Pi^{ij}_\Psi\Psi\Pi^{kl}_\Psi\Psi \tag{A4}$$

(note that $\mu$ and $\nu$ are not involved), where, for example,
$\Pi^{ij}_\Sigma$ is the matrix of partial derivatives of the $(i,j)$-th
component of the matrix $\Pi$ with respect to the components of $\Sigma$.

The partial derivatives $\Pi^{ij}_\Sigma$ and $\Pi^{ij}_\Psi$ are evaluated
with the aid of matrix differentials and the associated formulas for products
$d(AB) = (dA)B + A(dB)$ and for inverses $d(A^{-1}) = -A^{-1}(dA)A^{-1}$ [see,
e.g., Deemer and Olkin, 1951, results 5A15, 5A15, and 5B5]. If $Y = AXB$ (where
all capitals denote matrices), the formula

$$\frac{\partial y_{ij}}{\partial X} = A'E_{ij}B',$$

where $E_{ij}$ consists of a "1" in position $(i,j)$ and zeros elsewhere
[Dwyer & MacPhail, 1948], is used to evaluate the matrix derivative. In
case $X$ is symmetric, the formula becomes

$$\frac{\partial y_{ij}}{\partial X} = \tfrac12\left(A'E_{ij}B' + BE_{ji}A\right),$$

where $\partial y_{ij}/\partial X$ is defined with a factor of $\tfrac12$ on
off-diagonal elements.
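
The derivative rule for $Y = AXB$, namely that the derivative of $y_{ij}$ with respect to $X$ is $A'E_{ij}B'$, and its symmetrized form can be checked numerically. The sketch below assumes NumPy, uses zero-based indices, and the function names are hypothetical.

```python
import numpy as np

def dy_dX(A, B, i, j):
    """Matrix of derivatives of y_ij = (A X B)_ij with respect to the
    entries of an unrestricted X, i.e. A' E_ij B'."""
    E = np.zeros((A.shape[0], B.shape[1]))
    E[i, j] = 1.0
    return A.T @ E @ B.T

def dy_dX_symmetric(A, B, i, j):
    """Same derivative when X is symmetric, with the factor 1/2 on
    off-diagonal elements: (A' E_ij B' + B E_ji A) / 2."""
    D = dy_dX(A, B, i, j)
    return 0.5 * (D + D.T)
```

Because $y_{ij}$ is linear in $X$, each entry of `dy_dX` equals the change in $y_{ij}$ produced by adding one to the corresponding entry of $X$.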
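The differential of $\Pi$ is now obtained. From (A2) it follows that

$$\begin{aligned} d\Pi ={}& d\Sigma_{22} - d\Sigma_{21}(\Sigma_{11}-I)^{-1}\Sigma_{12} - \Sigma_{21}(\Sigma_{11}-I)^{-1}d\Sigma_{12} + \Sigma_{21}(\Sigma_{11}-I)^{-1}d\Sigma_{11}(\Sigma_{11}-I)^{-1}\Sigma_{12} \\ & - d\Psi_{22} + d\Psi_{21}(\Psi_{11}-I)^{-1}\Psi_{12} + \Psi_{21}(\Psi_{11}-I)^{-1}d\Psi_{12} - \Psi_{21}(\Psi_{11}-I)^{-1}d\Psi_{11}(\Psi_{11}-I)^{-1}\Psi_{12}. \end{aligned} \tag{A5}$$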

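The following partial derivatives are obtained:

$$\begin{aligned} \frac{\partial\Pi_{ij}}{\partial\Sigma_{22}} &= \tfrac12(E_{ij} + E_{ij}') = \tfrac12(E_{ij} + E_{ji}), \\ \frac{\partial\Pi_{ij}}{\partial\Sigma_{11}} &= \tfrac12\left[(\Sigma_{11}-I)^{-1}\Sigma_{12}E_{ij}\Sigma_{21}(\Sigma_{11}-I)^{-1} + (\Sigma_{11}-I)^{-1}\Sigma_{12}E_{ji}\Sigma_{21}(\Sigma_{11}-I)^{-1}\right], \\ \frac{\partial\Pi_{ij}}{\partial\Sigma_{21}} &= -(E_{ij} + E_{ij}')\Sigma_{21}(\Sigma_{11}-I)^{-1} = -(E_{ij} + E_{ji})\Sigma_{21}(\Sigma_{11}-I)^{-1}. \end{aligned} \tag{A6}$$

Remembering that the matrix $\partial\Pi_{ij}/\partial\Sigma$ is evaluated
with a factor of $\tfrac12$ applied to all off-diagonal components, we may
write it, using (A6), as follows, where the $q \times q$ symmetric matrix
$F_{ij}$ is defined by $F_{ij} = \tfrac12(E_{ij} + E_{ji})$:

$$\Pi^{ij}_\Sigma = \begin{bmatrix} (\Sigma_{11}-I)^{-1}\Sigma_{12}F_{ij}\Sigma_{21}(\Sigma_{11}-I)^{-1} & -(\Sigma_{11}-I)^{-1}\Sigma_{12}F_{ij} \\ -F_{ij}\Sigma_{21}(\Sigma_{11}-I)^{-1} & F_{ij} \end{bmatrix}. \tag{A7}$$

Since $\Sigma$ may be written as

$$\Sigma = \begin{bmatrix} (\Sigma_{11}-I) + I & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix},$$

it follows from (A7), by straightforward evaluation, that

$$\operatorname{tr}\Pi^{ij}_\Sigma\Sigma\Pi^{kl}_\Sigma\Sigma = \operatorname{tr}F_{ij}A^{(1)}F_{kl}A^{(1)} = \tfrac12\left(a^{(1)}_{ik}a^{(1)}_{jl} + a^{(1)}_{il}a^{(1)}_{jk}\right) \tag{A8}$$

[cf. Stroud, 1971, formula 5.1], where

$$A^{(1)} = \Sigma_{22} - \Sigma_{21}(\Sigma_{11}-I)^{-1}\Sigma_{12} + \Sigma_{21}(\Sigma_{11}-I)^{-2}\Sigma_{12}, \tag{A9}$$

and $a^{(1)}_{ik}$ stands for the $(i,k)$-th component of $A^{(1)}$.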
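Define $A^{(2)}$ in terms of $\Psi$ as in (A9). If we substitute into
(A4) the formula (A8) and the analogous formula involving $\Psi$ and
$A^{(2)}$, the following result is yielded:

$$\omega_{ij,kl} = \rho\left(a^{(1)}_{ik}a^{(1)}_{jl} + a^{(1)}_{il}a^{(1)}_{jk}\right) + a^{(2)}_{ik}a^{(2)}_{jl} + a^{(2)}_{il}a^{(2)}_{jk},$$

where $\rho$ is taken here as equal to $n/m$.

In the case $q = 1$, $\Omega$ equals the scalar $\omega_{11,11}$. $U$ is also
a scalar, so it is easy to write down the following formula for the
statistic $W$:

$$W = \frac{\left[\hat\sigma_{22} - \hat\Sigma_{21}(\hat\Sigma_{11}-I)^{-1}\hat\Sigma_{12} - \hat\psi_{22} + \hat\Psi_{21}(\hat\Psi_{11}-I)^{-1}\hat\Psi_{12}\right]^2}{2\left[(\hat a^{(1)}_{11})^2/m + (\hat a^{(2)}_{11})^2/n\right]}.$$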
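
For the case $q = 1$, the statistic for testing equality of residual variances can be computed directly from (A9). In that case (A3) gives $H = 2[(a^{(1)}_{11})^2/m + (a^{(2)}_{11})^2/n]$, and the sketch below uses that form. It assumes NumPy, covariance matrices that have already been rescaled so that the pretest error covariance matrix is the identity, and hypothetical function names.

```python
import numpy as np

def residual_and_a(cov, p):
    """Left side of (A2) for one group, and the scalar A of (A9), when q = 1."""
    S11, S12, S22 = cov[:p, :p], cov[:p, p:], cov[p:, p:]
    M = np.linalg.inv(S11 - np.eye(p))
    B = M @ S12
    resid = S22 - S12.T @ B   # S22 - S21 (S11 - I)^{-1} S12
    a = resid + B.T @ B       # adds S21 (S11 - I)^{-2} S12
    return resid.item(), a.item()

def wald_h2_q1(sigma_hat, psi_hat, m, n, p):
    """Wald statistic for H2 with a single posttest variable."""
    r1, a1 = residual_and_a(np.asarray(sigma_hat, float), p)
    r2, a2 = residual_and_a(np.asarray(psi_hat, float), p)
    variance = 2.0 * (a1 ** 2 / m + a2 ** 2 / n)
    return (r1 - r2) ** 2 / variance
```

The value returned would be referred to a chi-square distribution on one degree of freedom.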

A4. Testing for Equality of Regressions

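The development of this section parallels that of the preceding
section, but the presentation is somewhat more condensed to save space.
This time the quantity $g(\mu,\nu,\Sigma,\Psi)$ is the vector of dimension
$pq + q$ whose components are the components of the $q \times p$ matrix

$$\Phi = \Sigma_{21}(\Sigma_{11}-I)^{-1} - \Psi_{21}(\Psi_{11}-I)^{-1}$$

and the $q \times 1$ vector

$$\Lambda = \mu_2 - \Sigma_{21}(\Sigma_{11}-I)^{-1}\mu_1 - \nu_2 + \Psi_{21}(\Psi_{11}-I)^{-1}\nu_1.$$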
It is straightforward to show that the matrix of partial derivatives
of the transformation from $(\mu,\nu,\Sigma,\Psi)$ to $(\Phi,\Lambda)$ has full
rank of $pq + q$, by showing that the Jacobian of the transformation from
$(\Sigma_{21},\mu_2)$ to $(\Phi,\Lambda)$ is nonzero [see Deemer & Olkin, 1951,
Theorem 5.5].

The $pq + q$ components of $U$ are the components $\phi_{ij}$
($1 \le i \le q$, $1 \le j \le p$) of the matrix $\Phi$ and the components
$\lambda_i$ ($1 \le i \le q$) of the vector $\Lambda$ when
$(\mu,\nu,\Sigma,\Psi)$ is replaced by its estimator
$(\hat\mu,\hat\nu,\hat\Sigma,\hat\Psi)$.

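Each component of $\Omega$, as given by (A3), is one of three types,
according to whether $\alpha$ and $\beta$ are:
(i) both components of $\Phi$
(ii) both components of $\Lambda$
(iii) one a component of $\Phi$ and the other of $\Lambda$.
Let the components of $\Omega$ of these types be denoted, respectively, by
$\omega_{ij,kl}$ ($i,k = 1,\ldots,q$; $j,l = 1,\ldots,p$), $\omega_{i,k}$
($i,k = 1,\ldots,q$), and $\omega_{i,kl}$ ($i,k = 1,\ldots,q$;
$l = 1,\ldots,p$). The general formula (A3) may be rewritten as follows:

$$\begin{aligned} \omega_{ij,kl} &= \rho\,\Phi^{ij\prime}_\mu\Sigma\Phi^{kl}_\mu + \Phi^{ij\prime}_\nu\Psi\Phi^{kl}_\nu + 2\rho\,\operatorname{tr}\Phi^{ij}_\Sigma\Sigma\Phi^{kl}_\Sigma\Sigma + 2\operatorname{tr}\Phi^{ij}_\Psi\Psi\Phi^{kl}_\Psi\Psi, \\ \omega_{i,k} &= \rho\,\Lambda^{i\prime}_\mu\Sigma\Lambda^{k}_\mu + \Lambda^{i\prime}_\nu\Psi\Lambda^{k}_\nu + 2\rho\,\operatorname{tr}\Lambda^{i}_\Sigma\Sigma\Lambda^{k}_\Sigma\Sigma + 2\operatorname{tr}\Lambda^{i}_\Psi\Psi\Lambda^{k}_\Psi\Psi, \\ \omega_{i,kl} &= \rho\,\Lambda^{i\prime}_\mu\Sigma\Phi^{kl}_\mu + \Lambda^{i\prime}_\nu\Psi\Phi^{kl}_\nu + 2\rho\,\operatorname{tr}\Lambda^{i}_\Sigma\Sigma\Phi^{kl}_\Sigma\Sigma + 2\operatorname{tr}\Lambda^{i}_\Psi\Psi\Phi^{kl}_\Psi\Psi, \end{aligned} \tag{A10}$$

where $i,k = 1,\ldots,q$; $j,l = 1,\ldots,p$. The next step is to
obtain formulas for expressions like $\Phi^{ij}_\Sigma$, $\Lambda^{i}_\Sigma$,
and $\Lambda^{i}_\mu$, and then find simplified expressions for the terms of
the right sides of (A10).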
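To get $\Phi^{ij}_\Sigma$, first write the differential of $\Phi$:

$$d\Phi = d\Sigma_{21}(\Sigma_{11}-I)^{-1} - \Sigma_{21}(\Sigma_{11}-I)^{-1}d\Sigma_{11}(\Sigma_{11}-I)^{-1} - d\Psi_{21}(\Psi_{11}-I)^{-1} + \Psi_{21}(\Psi_{11}-I)^{-1}d\Psi_{11}(\Psi_{11}-I)^{-1}.$$

Hence, after differentiation with respect to submatrices, incorporating
the factor of $\tfrac12$ where necessary, one gets

$$\Phi^{ij}_\Sigma = \begin{bmatrix} -\tfrac12(\Sigma_{11}-I)^{-1}\left(\Sigma_{12}E_{ij} + E_{ji}\Sigma_{21}\right)(\Sigma_{11}-I)^{-1} & \tfrac12(\Sigma_{11}-I)^{-1}E_{ji} \\ \tfrac12E_{ij}(\Sigma_{11}-I)^{-1} & 0 \end{bmatrix}. \tag{A11}$$

By straightforward calculation and noting
$(\Sigma_{11}-I)^{-1}\Sigma_{11}(\Sigma_{11}-I)^{-1} = (\Sigma_{11}-I)^{-1} + (\Sigma_{11}-I)^{-2}$,
we obtain from (A11) that

$$\operatorname{tr}\Phi^{ij}_\Sigma\Sigma\Phi^{kl}_\Sigma\Sigma = \tfrac12\operatorname{tr}E_{ij}C^{(1)}E_{kl}'A^{(1)} + \tfrac12\operatorname{tr}E_{ij}B^{(1)}E_{kl}B^{(1)},$$

where, for $\alpha = 1,2$, the matrices $A^{(\alpha)}$ are defined by (A9),
and the following notation is introduced:

$$C^{(1)} = (\Sigma_{11}-I)^{-1}\Sigma_{11}(\Sigma_{11}-I)^{-1} = (\Sigma_{11}-I)^{-1} + (\Sigma_{11}-I)^{-2},$$

$$B^{(1)} = (\Sigma_{11}-I)^{-2}\Sigma_{12},$$

with $B^{(2)}$ and $C^{(2)}$ defined analogously for $\Psi$. Hence

$$\operatorname{tr}\Phi^{ij}_\Sigma\Sigma\Phi^{kl}_\Sigma\Sigma = \tfrac12\left(a^{(1)}_{ik}c^{(1)}_{jl} + b^{(1)}_{jk}b^{(1)}_{li}\right),$$

and a similar formula holds involving $\Psi$.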
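Since $\Phi$ does not involve $\mu$ or $\nu$, the first two terms in the
formula for $\omega_{ij,kl}$ vanish, and hence

$$\omega_{ij,kl} = \rho\left(a^{(1)}_{ik}c^{(1)}_{jl} + b^{(1)}_{jk}b^{(1)}_{li}\right) + a^{(2)}_{ik}c^{(2)}_{jl} + b^{(2)}_{jk}b^{(2)}_{li}$$

($i,k = 1,\ldots,q$; $j,l = 1,\ldots,p$).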

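The following result is obtained for the differential $d\Lambda$:

$$\begin{aligned} d\Lambda ={}& d\mu_2 - \Sigma_{21}(\Sigma_{11}-I)^{-1}d\mu_1 - d\Sigma_{21}(\Sigma_{11}-I)^{-1}\mu_1 + \Sigma_{21}(\Sigma_{11}-I)^{-1}d\Sigma_{11}(\Sigma_{11}-I)^{-1}\mu_1 \\ & - d\nu_2 + \Psi_{21}(\Psi_{11}-I)^{-1}d\nu_1 + d\Psi_{21}(\Psi_{11}-I)^{-1}\nu_1 - \Psi_{21}(\Psi_{11}-I)^{-1}d\Psi_{11}(\Psi_{11}-I)^{-1}\nu_1. \end{aligned}$$

Using the fact that when $E_{ij}$ contains only one column it may be written
as $e_i$, defined as a column vector with "1" in the $i$-th position and
zeros elsewhere, one may obtain

$$\Lambda^{i}_\Sigma = \begin{bmatrix} \tfrac12(\Sigma_{11}-I)^{-1}\left(\Sigma_{12}e_i\mu_1' + \mu_1e_i'\Sigma_{21}\right)(\Sigma_{11}-I)^{-1} & -\tfrac12(\Sigma_{11}-I)^{-1}\mu_1e_i' \\ -\tfrac12e_i\mu_1'(\Sigma_{11}-I)^{-1} & 0 \end{bmatrix} \tag{A12}$$

and

$$\Lambda^{i}_\mu = \begin{bmatrix} -(\Sigma_{11}-I)^{-1}\Sigma_{12}e_i \\ e_i \end{bmatrix}. \tag{A13}$$

It may be noted that formulas (A11) and (A12) are identical, except that
in (A12) the matrix $(-e_i\mu_1')$ replaces $E_{ij}$ of (A11). Thus (A12) implies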
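$$\operatorname{tr}\Lambda^{i}_\Sigma\Sigma\Lambda^{k}_\Sigma\Sigma = \tfrac12\operatorname{tr}e_i\mu_1'C^{(1)}\mu_1e_k'A^{(1)} + \tfrac12\operatorname{tr}e_i\mu_1'B^{(1)}e_k\mu_1'B^{(1)}. \tag{A14}$$

By further calculation it may be seen that relation (A13) implies

$$\Lambda^{i\prime}_\mu\Sigma\Lambda^{k}_\mu = e_i'A^{(1)}e_k = a^{(1)}_{ik}. \tag{A15}$$

Substitution of (A14), (A15) and analogous expressions involving $\Psi$ and
$\nu$ into (A10) yields the following formula:

$$\omega_{i,k} = \rho\left[(\mu_1'C^{(1)}\mu_1 + 1)\,a^{(1)}_{ik} + (\mu_1'B^{(1)})_i(\mu_1'B^{(1)})_k\right] + \left[(\nu_1'C^{(2)}\nu_1 + 1)\,a^{(2)}_{ik} + (\nu_1'B^{(2)})_i(\nu_1'B^{(2)})_k\right]$$

($i,k = 1,\ldots,q$).

To get the formula for $\omega_{i,kl}$, it is straightforward to obtain

$$\operatorname{tr}\Lambda^{i}_\Sigma\Sigma\Phi^{kl}_\Sigma\Sigma = -\tfrac12\left[a^{(1)}_{ik}(\mu_1'C^{(1)})_l + (\mu_1'B^{(1)})_k\,b^{(1)}_{li}\right].$$

Then, since $\Phi^{kl}_\mu = \Phi^{kl}_\nu = 0$, one obtains

$$\omega_{i,kl} = -\rho\left[a^{(1)}_{ik}(\mu_1'C^{(1)})_l + (\mu_1'B^{(1)})_k\,b^{(1)}_{li}\right] - \left[a^{(2)}_{ik}(\nu_1'C^{(2)})_l + (\nu_1'B^{(2)})_k\,b^{(2)}_{li}\right]$$

($i,k = 1,\ldots,q$; $l = 1,\ldots,p$).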
Expressions for all the components of $\Omega = nH$ have now been displayed.

When $X_2$ and $Y_2$ are univariate ($q = 1$, i.e., multiple but not
multivariate regression), one can present a reasonable looking formula for
$\Omega$ as a bordered matrix, rather than simply a collection of separate
formulas for components such as appears above. Eliminate superscripts and
write $a_\alpha = A^{(\alpha)} = a^{(\alpha)}_{11}$, $B_\alpha = B^{(\alpha)}$,
and $C_\alpha = C^{(\alpha)}$ for $\alpha = 1,2$. The $a_\alpha$ are scalars,
the $B_\alpha$ are $p \times 1$ column vectors and the $C_\alpha$ are
$p \times p$ symmetric matrices. Then define

$$G_\alpha = a_\alpha C_\alpha + B_\alpha B_\alpha' \qquad (\alpha = 1,2).$$

It is then straightforward to write the formula for $H$,

$$H = \frac{1}{mn}\begin{bmatrix} nG_1 + mG_2 & -nG_1\mu_1 - mG_2\nu_1 \\ -n\mu_1'G_1 - m\nu_1'G_2 & n(\mu_1'G_1\mu_1 + a_1) + m(\nu_1'G_2\nu_1 + a_2) \end{bmatrix}.$$

If $p = 1$ as well (simple regression), the formula for $\Omega$ may be
easily written down. The corresponding formula for
$W = nU'\hat\Omega^{-1}U$ may be found in Stroud [1972].
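
For a single posttest variable ($q = 1$), the covariance matrix of $U$ for testing equality of regressions can be assembled as a bordered matrix from $G_\alpha = a_\alpha C_\alpha + B_\alpha B_\alpha'$: the upper-left block is $G_1/m + G_2/n$, the border is $-G_1\mu_1/m - G_2\nu_1/n$, and the corner element is $(\mu_1'G_1\mu_1 + a_1)/m + (\nu_1'G_2\nu_1 + a_2)/n$. The sketch below follows that layout, which is this illustration's reading of the appendix formulas rather than a transcription of the author's program. It assumes NumPy, data already rescaled to unit pretest error variance, and hypothetical function names.

```python
import numpy as np

def group_terms(mean, cov, p):
    """Per-group quantities for q = 1: pretest mean, scalar a, matrix G,
    corrected slope vector, and intercept."""
    mu1 = mean[:p]
    S11, s12, s22 = cov[:p, :p], cov[:p, p], cov[p, p]
    M = np.linalg.inv(S11 - np.eye(p))
    a = s22 - s12 @ M @ s12 + s12 @ M @ M @ s12  # scalar A of (A9)
    B = M @ M @ s12                              # (S11 - I)^{-2} S12
    C = M + M @ M                                # (S11 - I)^{-1} + (S11 - I)^{-2}
    G = a * C + np.outer(B, B)
    slope = M @ s12
    intercept = mean[p] - slope @ mu1
    return mu1, a, G, slope, intercept

def wald_h1_q1(mu, nu, sigma, psi, m, n):
    """Wald statistic and covariance matrix H for H1 when q = 1."""
    mu, nu = np.asarray(mu, float), np.asarray(nu, float)
    sigma, psi = np.asarray(sigma, float), np.asarray(psi, float)
    p = mu.size - 1
    mu1, a1, G1, b1, c1 = group_terms(mu, sigma, p)
    nu1, a2, G2, b2, c2 = group_terms(nu, psi, p)
    H = np.empty((p + 1, p + 1))
    H[:p, :p] = G1 / m + G2 / n
    H[:p, p] = H[p, :p] = -(G1 @ mu1) / m - (G2 @ nu1) / n
    H[p, p] = (mu1 @ G1 @ mu1 + a1) / m + (nu1 @ G2 @ nu1 + a2) / n
    U = np.append(b1 - b2, c1 - c2)  # p slope differences, then the intercept
    return float(U @ np.linalg.solve(H, U)), H
```

The statistic returned would be referred to a chi-square distribution on $p + 1$ degrees of freedom.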